
CS224W - Bag of Tricks for Node Classification with GNN - GAT Normalization #9840

Open · wants to merge 16 commits into base: master
Conversation


@liuvince liuvince commented Dec 10, 2024

Add normalize parameter to GATConv and GATv2Conv.

Part of #9831, our final project for the Stanford CS224W course, this PR enables "GAT with Symmetric Normalized Adjacency Matrix" as described in "Bag of Tricks for Node Classification with Graph Neural Networks".

Details

  • Implemented gat_norm, inspired by gcn_norm, supporting edge_index given as a SparseTensor, a torch sparse tensor (is_torch_sparse_tensor), or a dense torch Tensor.
  • gat_norm is called after the alpha coefficients are computed and returns the updated edge_index and alpha, which are then passed as inputs to self.propagate.
  • Updated the docstrings of GATConv and GATv2Conv.
  • Added unit test cases.
  • Overrode the add_self_loops parameter: we remove self-loops from the initial graph before calling gat_norm, then add self-loops with normalization inside gat_norm, as described in the paper. We reused the tools already provided in the library, such as torch_sparse.fill_diag, to_edge_index, add_remaining_self_loops, add_self_loops, and to_torch_csr_tensor.
  • One concern: self-loops carry no learned attention weight regardless of add_self_loops, because we explicitly remove self-loops before the edge update. This is consistent with the paper's description and with gcn_norm, but differs from the paper's reference implementation, which also appears to use both out-degree and in-degree. We would appreciate your feedback on the preferred approach.
  • When is_torch_sparse_tensor(edge_index) == True, we had trouble converting edge_index and the corresponding values in att_mat back to the required format. Our workaround sorts the values of att_mat lexicographically so they match the index order of edge_index for the subsequent propagate and update steps.
  • When isinstance(edge_index, SparseTensor) and there are multiple heads (num_heads > 1), we need to perform the operation $D \alpha$, i.e. multiply a SparseTensor (whose values have more than one dimension) by the degree matrix. Our solution repeats the degree matrix num_heads times. We don't call repeat_interleave directly because it raises `"repeat_interleave_cpu" not implemented for 'Float'`; the current implementation follows the same behavior.
  • Only non-bipartite message passing is supported.
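To make the normalization step concrete, here is a minimal, self-contained sketch of what gat_norm might look like on dense COO inputs. It is an illustration of the idea, not the PR's actual code: the function name mirrors the PR, but the exact self-loop weight (1.0) and computing per-head degrees from the attention values are assumptions on our part.

```python
import torch

def gat_norm_sketch(edge_index, alpha, num_nodes):
    """Hypothetical sketch of symmetric attention normalization.

    edge_index: [2, E] COO indices (self-loops assumed already removed)
    alpha:      [E, H] attention coefficients, one column per head
    Returns edge_index with self-loops re-added and normalized alpha.
    """
    H = alpha.size(1)

    # Re-add self-loops; weight 1.0 per head is an assumption here.
    loop_index = torch.arange(num_nodes).unsqueeze(0).repeat(2, 1)
    edge_index = torch.cat([edge_index, loop_index], dim=1)
    alpha = torch.cat([alpha, torch.ones(num_nodes, H)], dim=0)

    # Per-head degrees accumulated from the attention values (assumption),
    # then the symmetric rescaling alpha'_ij = d_i^{-1/2} * alpha_ij * d_j^{-1/2}.
    row, col = edge_index[0], edge_index[1]
    deg = torch.zeros(num_nodes, H).index_add_(0, row, alpha)
    deg_inv_sqrt = deg.pow(-0.5)
    deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0  # isolated nodes
    alpha = deg_inv_sqrt[row] * alpha * deg_inv_sqrt[col]
    return edge_index, alpha
```

Broadcasting the [N, H] degree tensor through deg_inv_sqrt[row] also shows why no explicit repeat_interleave is needed in the dense case: each head's column is rescaled independently.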
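The lexicographic-sort workaround for the torch sparse tensor case can be sketched as follows. This is a toy illustration under our own naming (lexsort_coo is hypothetical): after editing attention values stored alongside COO indices, sort the entries row-major by (row, col) so the values line up with the canonical index order of the sparse tensor.

```python
import torch

def lexsort_coo(edge_index, values, num_nodes):
    """Sort COO entries lexicographically by (row, col) and permute
    the per-edge values to match (hypothetical helper for illustration)."""
    row, col = edge_index
    # Encode (row, col) as a single row-major key, then argsort it.
    perm = (row * num_nodes + col).argsort()
    return edge_index[:, perm], values[perm]
```

After this permutation, the values tensor can be paired with the sparse tensor's indices for the subsequent propagate and update steps.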

Benchmarks

Measured on a single T4 GPU: normalization improves test accuracy on the CiteSeer and PubMed datasets at a modest computation-time cost.

| Dataset  | Test Accuracy | Test Accuracy (with GAT Norm) | Duration | Duration (with GAT Norm) |
|----------|---------------|-------------------------------|----------|--------------------------|
| Cora     | 0.831 ± 0.004 | 0.825 ± 0.005                 | 4.296s   | 5.172s                   |
| CiteSeer | 0.707 ± 0.005 | 0.715 ± 0.005                 | 4.767s   | 5.592s                   |
| PubMed   | 0.789 ± 0.003 | 0.796 ± 0.004                 | 6.603s   | 7.204s                   |

with the following run commands:

python gat.py --dataset=Cora
python gat.py --dataset=Cora --normalize

python gat.py --dataset=CiteSeer
python gat.py --dataset=CiteSeer --normalize  

python gat.py --dataset=PubMed --lr=0.01 --output_heads=8 --weight_decay=0.001
python gat.py --dataset=PubMed --lr=0.01 --output_heads=8 --weight_decay=0.001 --normalize 

@liuvince liuvince requested review from a team, wsad1, EdisonLeeeee and rusty1s as code owners December 10, 2024 20:54
@liuvince liuvince changed the title Gat normalization CS224W - Bag of Tricks for Node Classification with GNN - GAT Normalization Dec 10, 2024

codecov bot commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 93.06931% with 7 lines in your changes missing coverage. Please review.

Project coverage is 86.36%. Comparing base (1519e9f) to head (ce302c4).
Report is 1 commit behind head on master.

Files with missing lines Patch % Lines
torch_geometric/typing.py 0.00% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9840      +/-   ##
==========================================
+ Coverage   85.29%   86.36%   +1.06%     
==========================================
  Files         478      490      +12     
  Lines       31918    32386     +468     
==========================================
+ Hits        27225    27969     +744     
+ Misses       4693     4417     -276     

